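Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algorithms can be changed. Researchers have proposed a number of desiderata that CEs should meet to be practically useful, such as requiring minimal effort to enact or complying with causal models. We consider a further aspect that improves the usability of CEs: robustness to adverse perturbations, which may naturally occur due to unfortunate circumstances. Since CEs typically prescribe a sparse form of intervention (i.e., only a subset of the features should be changed), we study the effect of addressing robustness separately for the features that are recommended to be changed and for those that are not. Our definitions are workable in that they can be incorporated as penalty terms in the loss functions used for discovering CEs. To experiment with robustness, we create and release code in which five data sets (commonly used in the field of fair and explainable machine learning) have been enriched with feature-specific annotations that can be used to sample meaningful perturbations. Our experiments show that CEs are often not robust and that, if adverse perturbations take place (even if they are not worst-case), the intervention they prescribe may require a much larger cost than anticipated, or even become impossible. However, accounting for robustness in the search process, which can be done rather easily, allows robust CEs to be discovered systematically. Robust CEs make the additional intervention needed to counteract perturbations much less costly than non-robust CEs do. We also find that robustness is easier to achieve for the features to change, which poses an important point of consideration when choosing which counterfactual explanation is best for the user. Our code is available at: https://github.com/marcovirgolin/robust-counterfactuals.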
The evaluation of object detection models is usually performed by optimizing a single metric, e.g. mAP, on a fixed set of datasets, e.g. Microsoft COCO and Pascal VOC. Due to image retrieval and annotation costs, these datasets consist largely of images found on the web and do not represent many real-life domains that are being modelled in practice, e.g. satellite, microscopic and gaming, making it difficult to assert the degree of generalization learned by the model. We introduce the Roboflow-100 (RF100) consisting of 100 datasets, 7 imagery domains, 224,714 images, and 805 class labels with over 11,170 labelling hours. We derived RF100 from over 90,000 public datasets, 60 million public images that are actively being assembled and labelled by computer vision practitioners in the open on the web application Roboflow Universe. By releasing RF100, we aim to provide a semantically diverse, multi-domain benchmark of datasets to help researchers test their model's generalizability with real-life data. RF100 download and benchmark replication are available on GitHub.
Within the glassy liquids community, the use of Machine Learning (ML) to model particles' static structure in order to predict their future dynamics is currently a hot topic. The current state of the art consists of Graph Neural Networks (GNNs) (Bapst 2020) which, besides having great expressive power, are heavy models with numerous parameters and lack interpretability. Inspired by recent advances (Thomas 2018), we build a GNN that learns a robust representation of the glass's static structure by constraining it to preserve roto-translation (SE(3)) equivariance. We show that this constraint not only significantly improves the predictive power but also allows us to reduce the number of parameters while improving interpretability. Furthermore, we relate our learned equivariant features to well-known invariant expert features, which are easily expressible with a single layer of our network.
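In this work we study high-probability bounds for stochastic subgradient methods under heavy-tailed noise. In this setting the noise is only assumed to have finite variance, as opposed to a sub-Gaussian distribution, for which standard subgradient methods are known to enjoy high-probability bounds. We analyze a clipped version of the projected stochastic subgradient method, in which subgradient estimates are truncated whenever they have a large norm. We show that this clipping strategy leads to both any-time and finite-horizon bounds for many classical averaging schemes. Preliminary experiments are presented to support the validity of the method.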
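The clipping idea described in this abstract can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the toy objective (a sum of absolute deviations from `target`), the box constraint, the Student-t noise with 3 degrees of freedom (heavy-tailed but with finite variance), the clipping threshold `tau`, the `step0 / sqrt(t)` step size, and the uniform averaging of iterates.

```python
import math
import random


def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau."""
    norm = math.sqrt(sum(v * v for v in g))
    if norm <= tau:
        return list(g)
    return [v * tau / norm for v in g]


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d."""
    return [min(max(v, lo), hi) for v in x]


def student_t(rng, df):
    """Sample a Student-t variate (finite variance whenever df > 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def noisy_subgradient(x, target, rng, df=3.0):
    """Subgradient of sum_i |x_i - target_i| plus heavy-tailed noise."""
    sign = lambda d: (d > 0) - (d < 0)
    return [sign(xi - ti) + student_t(rng, df) for xi, ti in zip(x, target)]


def clipped_subgradient_method(target, steps=5000, tau=5.0, step0=1.0, seed=0):
    """Clipped projected stochastic subgradient method with uniform averaging."""
    rng = random.Random(seed)
    x = [0.0] * len(target)
    avg = [0.0] * len(target)
    for t in range(1, steps + 1):
        g = clip(noisy_subgradient(x, target, rng), tau)  # truncate large estimates
        eta = step0 / math.sqrt(t)                        # assumed step-size schedule
        x = project_box([xi - eta * gi for xi, gi in zip(x, g)], -10.0, 10.0)
        avg = [a + (xi - a) / t for a, xi in zip(avg, x)]  # running mean of iterates
    return avg
```

Calling `clipped_subgradient_method([3.0, -2.0])` returns the averaged iterate, which on this toy problem should settle near the minimizer `[3.0, -2.0]`.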
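We analyze a class of bilevel problems in which the upper-level problem consists of minimizing a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This type of problem includes instances of meta-learning, equilibrium models, hyperparameter optimization, and data poisoning adversarial attacks. Several recent works have proposed algorithms that warm-start the lower-level problem, i.e., they use the previous lower-level approximate solution as a starting point for the lower-level solver. This warm-start procedure makes it possible to improve the sample complexity in both the stochastic and deterministic settings, achieving in some cases the order-wise optimal sample complexity. However, there are situations, e.g., meta-learning and equilibrium models, in which the warm-start procedure is not well suited or is ineffective. In this work we show that, without warm-start, it is still possible to achieve order-wise optimal or near-optimal sample complexity. In particular, we propose a simple method that uses (stochastic) fixed-point iterations at the lower level and projected inexact gradient descent at the upper level, and that reaches an $\epsilon$-stationary point using $O(\epsilon^{-2})$ and $\tilde{O}(\epsilon^{-1})$ samples for the stochastic and deterministic settings, respectively. Finally, compared to methods that use warm-start, our approach yields a simpler analysis that does not need to study the coupled interactions between the upper-level and lower-level iterates.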
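The structure of such a method (fixed-point iterations at the lower level, projected inexact gradient descent at the upper level, no warm-start) can be sketched on a scalar toy problem in the deterministic setting. The contraction map `phi`, the upper-level objective, the interval `[0, 5]`, the step size, and the iteration counts below are all hypothetical choices for illustration, not the paper's setup. The hypergradient is approximated by implicit differentiation, where the required linear system is itself solved by fixed-point iteration.

```python
# Hypothetical toy problem (all choices are assumptions for illustration):
#   lower level: w = phi(w, lam) = 0.5 * w + lam, so the fixed point is w*(lam) = 2 * lam
#   upper level: minimize f(w*(lam)) = 0.5 * (w*(lam) - 4)^2 over lam in [0, 5]
# The minimizer is lam = 2.

DPHI_DW = 0.5    # partial derivative of phi with respect to w (contraction factor)
DPHI_DLAM = 1.0  # partial derivative of phi with respect to lam


def phi(w, lam):
    return DPHI_DW * w + DPHI_DLAM * lam


def df_dw(w):
    return w - 4.0


def fixed_point(step, start, iters):
    """Run `iters` fixed-point iterations z <- step(z) from `start`."""
    z = start
    for _ in range(iters):
        z = step(z)
    return z


def approx_hypergradient(lam, inner_iters=30):
    # Cold start: the lower-level solver restarts from w = 0 at every call,
    # i.e. no warm-start from the previous outer iteration.
    w = fixed_point(lambda z: phi(z, lam), 0.0, inner_iters)
    # Solve v = DPHI_DW * v + df_dw(w) by fixed-point iteration; its solution
    # is df_dw(w) / (1 - DPHI_DW), the implicit-differentiation adjoint.
    v = fixed_point(lambda z: DPHI_DW * z + df_dw(w), 0.0, inner_iters)
    # f has no direct dependence on lam here, so the hypergradient is DPHI_DLAM * v.
    return DPHI_DLAM * v


def projected_inexact_gd(lam0=0.0, outer_iters=50, step=0.1, lo=0.0, hi=5.0):
    lam = lam0
    for _ in range(outer_iters):
        lam = lam - step * approx_hypergradient(lam)
        lam = min(max(lam, lo), hi)  # projection onto [lo, hi]
    return lam
```

On this toy problem the exact hypergradient is `4 * lam - 8`, so `projected_inexact_gd()` should approach `lam = 2`.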
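In this work we propose a batch version of the Greenkhorn algorithm for multimarginal regularized optimal transport problems. Our framework is general enough to cover, as particular cases, some existing algorithms such as the Sinkhorn and Greenkhorn algorithms for the bi-marginal setting and (greedy) MultiSinkhorn for multimarginal optimal transport. We provide a complete convergence analysis, which is based on the properties of the iterative Bregman projections (IBP) method with greedy control. Global convergence and explicit bounds on the iteration complexity are obtained. When specialized to the algorithms mentioned above, our results provide new insights and/or improve on existing ones.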
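For orientation, the bi-marginal Greenkhorn special case mentioned in this abstract can be sketched as follows. This is the standard greedy variant of Sinkhorn scaling, not the batch multimarginal algorithm the paper proposes. The regularization value `reg`, the iteration count, and the function names are illustrative assumptions.

```python
import math


def rho(a, b):
    """Violation measure used for the greedy choice: b - a + a * log(a / b)."""
    return b - a + a * math.log(a / b)


def greenkhorn(cost, r, c, reg=0.5, iters=2000):
    """Greedy Sinkhorn scaling for entropic OT between marginals r and c.

    Each iteration rescales only the single row or column whose current
    marginal deviates most (in the rho sense) from its target.
    """
    n, m = len(r), len(c)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Current marginals of the plan P = diag(u) K diag(v).
        row = [u[i] * sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        col = [v[j] * sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        i_best = max(range(n), key=lambda i: rho(r[i], row[i]))
        j_best = max(range(m), key=lambda j: rho(c[j], col[j]))
        if rho(r[i_best], row[i_best]) >= rho(c[j_best], col[j_best]):
            u[i_best] *= r[i_best] / row[i_best]  # match row i_best exactly
        else:
            v[j_best] *= c[j_best] / col[j_best]  # match column j_best exactly
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

A small usage example: with `cost = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]`, `r = [0.5, 0.3, 0.2]` and `c = [0.2, 0.3, 0.5]`, the returned plan has row sums close to `r` and column sums close to `c`. Updating every row and then every column in each sweep, instead of a single greedy index, would give the plain Sinkhorn iteration.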
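Modern applications require robots to comply with multiple, often conflicting rules and to interact with other agents. We present Posetal Games as a class of games in which each player expresses a preference over the outcomes via a partially ordered set of metrics. This allows one to combine the hierarchical priorities of each player with the interactive nature of the environment. By contextualizing standard game-theoretic notions, we provide two sufficient conditions on the players' preferences that prove the existence of pure Nash equilibria in games with finite action sets. Moreover, we define formal operations on the preference structures and link them to refinements of the game solutions, showing how the set of equilibria can be systematically shrunk. The presented results are showcased in a driving game in which autonomous vehicles select from a finite set of trajectories. The results demonstrate the interpretability of the outcomes in terms of minimum-rank violation for each player.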
We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning. We show that an approximate version of the bilevel problem can be solved by taking into explicit account the optimization dynamics for the inner objective. Depending on the specific setting, the outer variables take either the meaning of hyperparameters in a supervised learning problem or parameters of a meta-learner. We provide sufficient conditions under which solutions of the approximate problem converge to those of the exact problem. We instantiate our approach for meta-learning in the case of deep learning where representation layers are treated as hyperparameters shared across a set of training episodes. In experiments, we confirm our theoretical findings, present encouraging results for few-shot learning and contrast the bilevel approach against classical approaches for learning-to-learn.